Overview

Dataset statistics

Number of variables: 12
Number of observations: 2969
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 301.5 KiB
Average record size in memory: 104.0 B
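
The two memory figures are consistent with each other. The short check below assumes that 1 KiB = 1024 B and that the average record size is the total size divided by the number of observations; both are assumptions about how the report derives these values, not something the report states.

```python
# Values copied from the dataset statistics above.
n_observations = 2969
avg_record_bytes = 104.0

# Assumption: total size = observations x average record size, 1 KiB = 1024 B.
total_kib = n_observations * avg_record_bytes / 1024

print(f"{total_kib:.1f} KiB")  # 301.5 KiB
```

A record size of 104 B would also match 13 eight-byte values per row (12 numeric columns plus an index), though that breakdown is a guess.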

Variable types

Numeric: 12

Alerts

gross_revenue is highly overall correlated with qtde_invoice and 5 other fields [High correlation]
recency_days is highly overall correlated with qtde_invoice [High correlation]
qtde_invoice is highly overall correlated with gross_revenue and 2 other fields [High correlation]
qtde_items is highly overall correlated with gross_revenue and 5 other fields [High correlation]
qtde_products is highly overall correlated with gross_revenue and 3 other fields [High correlation]
avg_basket_size is highly overall correlated with gross_revenue and 3 other fields [High correlation]
avg_unique_basket_size is highly overall correlated with qtde_products [High correlation]
freq is highly overall correlated with avg_rec_days [High correlation]
avg_ticket is highly overall correlated with gross_revenue and 3 other fields [High correlation]
avg_rec_days is highly overall correlated with freq [High correlation]
qtde_returns is highly overall correlated with gross_revenue and 3 other fields [High correlation]
avg_basket_size is highly skewed (γ1 = 44.67431359) [Skewed]
avg_ticket is highly skewed (γ1 = 53.44421547) [Skewed]
qtde_returns is highly skewed (γ1 = 52.70290171) [Skewed]
customer_id has unique values [Unique]
recency_days has 34 (1.1%) zeros [Zeros]
qtde_returns has 1481 (49.9%) zeros [Zeros]
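
The Skewed alerts quote γ1, the skewness coefficient. As a minimal sketch, the population form of this coefficient is the third central moment divided by the second central moment raised to the power 1.5. The report may use a bias-corrected variant, so small differences from the values above would be expected. The `skewness` helper and both toy samples are illustrative and do not come from the report.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Toy data: a symmetric sample and a sample with one extreme value.
symmetric = [1, 2, 3, 4, 5]
long_tail = [0] * 50 + [1] * 30 + [5] * 15 + [100] * 4 + [10000]

print(skewness(symmetric))  # zero for symmetric data
print(skewness(long_tail))  # large and positive: a long right tail
```

A single extreme customer is enough to push the coefficient into the range reported for avg_ticket and qtde_returns, which is consistent with the maximum values listed for those variables.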

Reproduction

Analysis started: 2022-12-19 18:33:20.141687
Analysis finished: 2022-12-19 18:33:51.919225
Duration: 31.78 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

customer_id
Real number (ℝ)

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.773
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12619.4
Q1: 13799
median: 15221
Q3: 16768
95-th percentile: 17964.6
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969

Descriptive statistics

Standard deviation: 1718.9903
Coefficient of variation (CV): 0.11256734
Kurtosis: -1.2060947
Mean: 15270.773
Median Absolute Deviation (MAD): 1488
Skewness: 0.031607859
Sum: 45338925
Variance: 2954927.6
Monotonicity: Not monotonic
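
Several of these statistics follow directly from others in the same tables. The sketch below recomputes them from the reported customer_id values, assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). Because the inputs are already rounded, the last digits can differ slightly from the report.

```python
# Reported customer_id statistics, copied from the tables above.
minimum, maximum = 12347, 18287
q1, q3 = 13799, 16768
mean, std = 15270.773, 1718.9903

value_range = maximum - minimum  # reported Range: 5940
iqr = q3 - q1                    # reported IQR: 2969
cv = std / mean                  # reported CV: 0.11256734
variance = std ** 2              # reported Variance: 2954927.6

print(value_range, iqr, cv, variance)
```

The same relationships can be used to sanity-check the other variables in this report.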
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
17850 1
 
< 0.1%
17588 1
 
< 0.1%
14905 1
 
< 0.1%
16103 1
 
< 0.1%
14626 1
 
< 0.1%
14868 1
 
< 0.1%
18246 1
 
< 0.1%
17115 1
 
< 0.1%
16611 1
 
< 0.1%
15912 1
 
< 0.1%
Other values (2959) 2959
99.7%
ValueCountFrequency (%)
12347 1
< 0.1%
12348 1
< 0.1%
12352 1
< 0.1%
12356 1
< 0.1%
12358 1
< 0.1%
12359 1
< 0.1%
12360 1
< 0.1%
12362 1
< 0.1%
12364 1
< 0.1%
12370 1
< 0.1%
ValueCountFrequency (%)
18287 1
< 0.1%
18283 1
< 0.1%
18282 1
< 0.1%
18277 1
< 0.1%
18276 1
< 0.1%
18274 1
< 0.1%
18273 1
< 0.1%
18272 1
< 0.1%
18270 1
< 0.1%
18269 1
< 0.1%

gross_revenue
Real number (ℝ)

Distinct: 2954
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2747.1004
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.77
Q1: 570.96
median: 1084.1
Q3: 2308.06
95-th percentile: 7219.68
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1737.1

Descriptive statistics

Standard deviation: 10560.058
Coefficient of variation (CV): 3.8440742
Kurtosis: 355.5067
Mean: 2747.1004
Median Absolute Deviation (MAD): 672.05
Skewness: 16.802797
Sum: 8156141.2
Variance: 1.1151482 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
178.96 2
 
0.1%
533.33 2
 
0.1%
889.93 2
 
0.1%
2053.02 2
 
0.1%
745.06 2
 
0.1%
379.65 2
 
0.1%
2092.32 2
 
0.1%
731.9 2
 
0.1%
1353.74 2
 
0.1%
331 2
 
0.1%
Other values (2944) 2949
99.3%
ValueCountFrequency (%)
6.2 1
< 0.1%
13.3 1
< 0.1%
15 1
< 0.1%
36.56 1
< 0.1%
45 1
< 0.1%
52 1
< 0.1%
52.2 1
< 0.1%
52.2 1
< 0.1%
62.43 1
< 0.1%
68.84 1
< 0.1%
ValueCountFrequency (%)
279138.02 1
< 0.1%
259657.3 1
< 0.1%
194550.79 1
< 0.1%
168472.5 1
< 0.1%
136263.72 1
< 0.1%
124564.53 1
< 0.1%
116725.63 1
< 0.1%
91062.38 1
< 0.1%
72882.09 1
< 0.1%
66653.56 1
< 0.1%

recency_days
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.288649
Minimum: 0
Maximum: 373
Zeros: 34
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.756171
Coefficient of variation (CV): 1.2094852
Kurtosis: 2.7780386
Mean: 64.288649
Median Absolute Deviation (MAD): 26
Skewness: 1.7983969
Sum: 190873
Variance: 6046.0221
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 99
 
3.3%
4 87
 
2.9%
2 85
 
2.9%
3 85
 
2.9%
8 76
 
2.6%
10 67
 
2.3%
9 66
 
2.2%
7 66
 
2.2%
17 64
 
2.2%
22 55
 
1.9%
Other values (262) 2219
74.7%
ValueCountFrequency (%)
0 34
 
1.1%
1 99
3.3%
2 85
2.9%
3 85
2.9%
4 87
2.9%
5 43
1.4%
7 66
2.2%
8 76
2.6%
9 66
2.2%
10 67
2.3%
ValueCountFrequency (%)
373 2
0.1%
372 4
0.1%
371 1
 
< 0.1%
368 1
 
< 0.1%
366 4
0.1%
365 2
0.1%
364 1
 
< 0.1%
360 1
 
< 0.1%
359 1
 
< 0.1%
358 4
0.1%

qtde_invoice
Real number (ℝ)

Distinct: 57
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7217918
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.847316
Coefficient of variation (CV): 1.5462492
Kurtosis: 190.04523
Mean: 5.7217918
Median Absolute Deviation (MAD): 2
Skewness: 10.743151
Sum: 16988
Variance: 78.275
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2 786
26.5%
3 498
16.8%
4 393
13.2%
5 237
 
8.0%
1 190
 
6.4%
6 173
 
5.8%
7 138
 
4.6%
8 98
 
3.3%
9 70
 
2.4%
11 54
 
1.8%
Other values (47) 332
11.2%
ValueCountFrequency (%)
1 190
 
6.4%
2 786
26.5%
3 498
16.8%
4 393
13.2%
5 237
 
8.0%
6 173
 
5.8%
7 138
 
4.6%
8 98
 
3.3%
9 70
 
2.4%
10 54
 
1.8%
ValueCountFrequency (%)
206 1
< 0.1%
198 1
< 0.1%
124 1
< 0.1%
97 1
< 0.1%
91 2
0.1%
86 1
< 0.1%
72 1
< 0.1%
62 2
0.1%
60 1
< 0.1%
57 1
< 0.1%

qtde_items
Real number (ℝ)

Distinct: 1670
Distinct (%): 56.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1606.4187
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 101.4
Q1: 296
median: 639
Q3: 1399
95-th percentile: 4407.4
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1103

Descriptive statistics

Standard deviation: 5882.5587
Coefficient of variation (CV): 3.6619088
Kurtosis: 467.23973
Mean: 1606.4187
Median Absolute Deviation (MAD): 420
Skewness: 17.879512
Sum: 4769457
Variance: 34604497
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
310 11
 
0.4%
150 9
 
0.3%
88 9
 
0.3%
246 8
 
0.3%
260 8
 
0.3%
288 8
 
0.3%
272 8
 
0.3%
84 8
 
0.3%
134 8
 
0.3%
394 7
 
0.2%
Other values (1660) 2885
97.2%
ValueCountFrequency (%)
1 1
< 0.1%
2 2
0.1%
12 2
0.1%
16 1
< 0.1%
17 1
< 0.1%
18 1
< 0.1%
19 1
< 0.1%
20 1
< 0.1%
23 1
< 0.1%
25 1
< 0.1%
ValueCountFrequency (%)
196844 1
< 0.1%
80997 1
< 0.1%
79879 1
< 0.1%
77373 1
< 0.1%
69993 1
< 0.1%
64549 1
< 0.1%
64124 1
< 0.1%
62812 1
< 0.1%
58243 1
< 0.1%
57772 1
< 0.1%

qtde_products
Real number (ℝ)

Distinct: 469
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.66319
Minimum: 1
Maximum: 7837
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
median: 67
Q3: 135
95-th percentile: 382
Maximum: 7837
Range: 7836
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.2448
Coefficient of variation (CV): 2.1949927
Kurtosis: 354.42387
Mean: 122.66319
Median Absolute Deviation (MAD): 44
Skewness: 15.678323
Sum: 364187
Variance: 72492.76
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
28 46
 
1.5%
20 38
 
1.3%
35 35
 
1.2%
15 33
 
1.1%
29 32
 
1.1%
19 32
 
1.1%
11 32
 
1.1%
26 31
 
1.0%
18 30
 
1.0%
27 30
 
1.0%
Other values (459) 2630
88.6%
ValueCountFrequency (%)
1 6
 
0.2%
2 14
0.5%
3 16
0.5%
4 17
0.6%
5 26
0.9%
6 29
1.0%
7 18
0.6%
8 19
0.6%
9 27
0.9%
10 27
0.9%
ValueCountFrequency (%)
7837 1
< 0.1%
5586 1
< 0.1%
5095 1
< 0.1%
4577 1
< 0.1%
2698 1
< 0.1%
2379 1
< 0.1%
2060 1
< 0.1%
1818 1
< 0.1%
1673 1
< 0.1%
1636 1
< 0.1%

avg_basket_size
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 1974
Distinct (%): 66.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.39271
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.25
median: 172
Q3: 281.5
95-th percentile: 599.52
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.25

Descriptive statistics

Standard deviation: 791.55574
Coefficient of variation (CV): 3.173933
Kurtosis: 2255.6274
Mean: 249.39271
Median Absolute Deviation (MAD): 82.75
Skewness: 44.674314
Sum: 740446.95
Variance: 626560.48
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
100 11
 
0.4%
114 10
 
0.3%
73 9
 
0.3%
82 9
 
0.3%
86 9
 
0.3%
60 8
 
0.3%
140 8
 
0.3%
75 8
 
0.3%
88 8
 
0.3%
163 8
 
0.3%
Other values (1964) 2881
97.0%
ValueCountFrequency (%)
1 2
0.1%
2 1
< 0.1%
3.333333333 1
< 0.1%
5.333333333 1
< 0.1%
5.666666667 1
< 0.1%
6.142857143 1
< 0.1%
7.5 1
< 0.1%
9 1
< 0.1%
9.5 1
< 0.1%
11 1
< 0.1%
ValueCountFrequency (%)
40498.5 1
< 0.1%
6009.333333 1
< 0.1%
4282 1
< 0.1%
3906 1
< 0.1%
3868.65 1
< 0.1%
2880 1
< 0.1%
2801 1
< 0.1%
2733.944444 1
< 0.1%
2518.769231 1
< 0.1%
2160.333333 1
< 0.1%

avg_unique_basket_size
Real number (ℝ)

Distinct: 1009
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.15117
Minimum: 1
Maximum: 299.70588
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.3454545
Q1: 10
median: 17.2
Q3: 27.75
95-th percentile: 56.94
Maximum: 299.70588
Range: 298.70588
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 19.512963
Coefficient of variation (CV): 0.88089984
Kurtosis: 27.697928
Mean: 22.15117
Median Absolute Deviation (MAD): 8.2
Skewness: 3.4988031
Sum: 65766.825
Variance: 380.75571
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
13 54
 
1.8%
14 40
 
1.3%
11 38
 
1.3%
9 33
 
1.1%
18 33
 
1.1%
1 32
 
1.1%
20 31
 
1.0%
10 30
 
1.0%
16 29
 
1.0%
17 28
 
0.9%
Other values (999) 2621
88.3%
ValueCountFrequency (%)
1 32
1.1%
1.2 1
 
< 0.1%
1.25 1
 
< 0.1%
1.333333333 2
 
0.1%
1.5 8
 
0.3%
1.568181818 1
 
< 0.1%
1.571428571 1
 
< 0.1%
1.666666667 4
 
0.1%
1.833333333 1
 
< 0.1%
2 24
0.8%
ValueCountFrequency (%)
299.7058824 1
< 0.1%
259 1
< 0.1%
203.5 1
< 0.1%
148 1
< 0.1%
145 1
< 0.1%
136.125 1
< 0.1%
135.5 1
< 0.1%
127 1
< 0.1%
122 1
< 0.1%
118 1
< 0.1%

freq
Real number (ℝ)

Distinct: 1349
Distinct (%): 45.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.063268815
Minimum: 0.0054495913
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 0.0054495913
5-th percentile: 0.0094339623
Q1: 0.017777778
median: 0.029411765
Q3: 0.055401662
95-th percentile: 0.22222222
Maximum: 3
Range: 2.9945504
Interquartile range (IQR): 0.037623884

Descriptive statistics

Standard deviation: 0.13447745
Coefficient of variation (CV): 2.1254933
Kurtosis: 121.57483
Mean: 0.063268815
Median Absolute Deviation (MAD): 0.014338235
Skewness: 8.7739699
Sum: 187.84511
Variance: 0.018084183
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.3333333333 21
 
0.7%
0.1666666667 21
 
0.7%
0.02777777778 20
 
0.7%
0.09090909091 19
 
0.6%
0.0625 17
 
0.6%
0.4 16
 
0.5%
0.1333333333 16
 
0.5%
0.03571428571 15
 
0.5%
0.02380952381 15
 
0.5%
0.25 15
 
0.5%
Other values (1339) 2794
94.1%
ValueCountFrequency (%)
0.005449591281 1
 
< 0.1%
0.005464480874 1
 
< 0.1%
0.005494505495 1
 
< 0.1%
0.005509641873 1
 
< 0.1%
0.005586592179 2
0.1%
0.005602240896 1
 
< 0.1%
0.005617977528 2
0.1%
0.00566572238 1
 
< 0.1%
0.005681818182 2
0.1%
0.005698005698 3
0.1%
ValueCountFrequency (%)
3 1
 
< 0.1%
2 1
 
< 0.1%
1.571428571 1
 
< 0.1%
1.5 3
 
0.1%
1 14
0.5%
0.8333333333 1
 
< 0.1%
0.75 1
 
< 0.1%
0.6666666667 12
0.4%
0.6487935657 1
 
< 0.1%
0.6 1
 
< 0.1%

avg_ticket
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 2966
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.893306
Minimum: 2.1505882
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 2.1505882
5-th percentile: 4.9166611
Q1: 13.119333
median: 17.940811
Q3: 24.97963
95-th percentile: 90.497
Maximum: 56157.5
Range: 56155.349
Interquartile range (IQR): 11.860296

Descriptive statistics

Standard deviation: 1036.9345
Coefficient of variation (CV): 19.982048
Kurtosis: 2890.7065
Mean: 51.893306
Median Absolute Deviation (MAD): 5.9641157
Skewness: 53.444215
Sum: 154071.22
Variance: 1075233.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
15 2
 
0.1%
4.162 2
 
0.1%
14.47833333 2
 
0.1%
18.15222222 1
 
< 0.1%
13.92736842 1
 
< 0.1%
36.24411765 1
 
< 0.1%
29.78416667 1
 
< 0.1%
22.8792623 1
 
< 0.1%
20.51104167 1
 
< 0.1%
149.025 1
 
< 0.1%
Other values (2956) 2956
99.6%
ValueCountFrequency (%)
2.150588235 1
< 0.1%
2.4325 1
< 0.1%
2.462371134 1
< 0.1%
2.511241379 1
< 0.1%
2.515333333 1
< 0.1%
2.65 1
< 0.1%
2.656931818 1
< 0.1%
2.707598253 1
< 0.1%
2.760621572 1
< 0.1%
2.770464191 1
< 0.1%
ValueCountFrequency (%)
56157.5 1
< 0.1%
4453.43 1
< 0.1%
3202.92 1
< 0.1%
1687.2 1
< 0.1%
952.9875 1
< 0.1%
872.13 1
< 0.1%
841.0214493 1
< 0.1%
651.1683333 1
< 0.1%
640 1
< 0.1%
624.4 1
< 0.1%

avg_rec_days
Real number (ℝ)

Distinct: 1258
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.35143
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 25.928571
median: 48.285714
Q3: 85.333333
95-th percentile: 201
Maximum: 366
Range: 365
Interquartile range (IQR): 59.404762

Descriptive statistics

Standard deviation: 63.542829
Coefficient of variation (CV): 0.94345182
Kurtosis: 4.8877032
Mean: 67.35143
Median Absolute Deviation (MAD): 26.285714
Skewness: 2.062909
Sum: 199966.4
Variance: 4037.6912
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
14 25
 
0.8%
4 22
 
0.7%
70 21
 
0.7%
7 20
 
0.7%
35 19
 
0.6%
49 18
 
0.6%
21 17
 
0.6%
46 17
 
0.6%
11 17
 
0.6%
1 16
 
0.5%
Other values (1248) 2777
93.5%
ValueCountFrequency (%)
1 16
0.5%
1.5 1
 
< 0.1%
2 13
0.4%
2.5 1
 
< 0.1%
2.601398601 1
 
< 0.1%
3 15
0.5%
3.321428571 1
 
< 0.1%
3.330357143 1
 
< 0.1%
3.5 2
 
0.1%
4 22
0.7%
ValueCountFrequency (%)
366 1
 
< 0.1%
365 1
 
< 0.1%
363 1
 
< 0.1%
362 1
 
< 0.1%
357 2
0.1%
356 1
 
< 0.1%
355 2
0.1%
352 1
 
< 0.1%
351 2
0.1%
350 3
0.1%

qtde_returns
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 173
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.255978
Minimum: 0
Maximum: 80995
Zeros: 1481
Zeros (%): 49.9%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 6
95-th percentile: 63
Maximum: 80995
Range: 80995
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 1503.4837
Coefficient of variation (CV): 28.771515
Kurtosis: 2833.6407
Mean: 52.255978
Median Absolute Deviation (MAD): 1
Skewness: 52.702902
Sum: 155148
Variance: 2260463.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 1481
49.9%
1 295
 
9.9%
3 169
 
5.7%
6 93
 
3.1%
2 87
 
2.9%
4 71
 
2.4%
5 43
 
1.4%
12 43
 
1.4%
8 40
 
1.3%
7 38
 
1.3%
Other values (163) 609
20.5%
ValueCountFrequency (%)
0 1481
49.9%
1 295
 
9.9%
2 87
 
2.9%
3 169
 
5.7%
4 71
 
2.4%
5 43
 
1.4%
6 93
 
3.1%
7 38
 
1.3%
8 40
 
1.3%
9 36
 
1.2%
ValueCountFrequency (%)
80995 1
< 0.1%
9014 1
< 0.1%
4824 1
< 0.1%
4027 1
< 0.1%
2302 2
0.1%
1776 1
< 0.1%
1608 1
< 0.1%
1589 1
< 0.1%
1515 1
< 0.1%
1278 1
< 0.1%

Interactions

[Interaction plots: 144 images, one per pair of the 12 variables; not reproduced here.]

Correlations


Auto

The auto setting selects an interpretable pairwise metric for each pair of columns, according to the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramér's V, [0,1]
  • Numerical-Categorical : Cramér's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
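
All 12 variables in this report are numeric, so the auto setting reduces to Spearman's ρ here. For the categorical cases in the mapping, the following is a minimal sketch of Cramér's V computed from a contingency table. It omits any bias correction the library may apply, and the `cramers_v` function and the toy tables are illustrative only.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes every row and every column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Chi-squared statistic: squared deviation from the expected count,
    # scaled by the expected count, summed over all cells.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))

print(cramers_v([[10, 0], [0, 10]]))  # perfect association
print(cramers_v([[5, 5], [5, 5]]))    # no association
```

For a Numerical-Categorical pair, the numerical column would first be cut into bins, and the resulting bin labels would play the role of one of the two categorical variables.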

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
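
The calculation described above can be sketched in plain Python: rank both variables (tied values share the mean of their ranks), then apply the covariance-over-standard-deviations formula to the ranks. The helper names and the toy data are illustrative and are not taken from the report.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear

print(spearman(x, y))  # approximately 1: the relationship is monotonic
print(pearson(x, y))   # below 1: the relationship is not linear
```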

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
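
A small sketch of this formula, together with a check of the invariance property mentioned above, is shown below. The `pearson` helper, the toy data, and the particular shifts and scale factors are illustrative choices.

```python
import math

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors in the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 5.0, 11.0]

r = pearson(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_shifted = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r, r_shifted)  # equal up to floating-point rounding
```

Note that the invariance holds for positive scale factors; multiplying one variable by a negative factor flips the sign of r.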

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
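
The pair-counting definition above corresponds to the simplest form of the coefficient, often called tau-a. The sketch below implements exactly that form; statistical libraries commonly use a tie-adjusted variant instead, so results can differ when the data contain ties. The `kendall_tau` helper and the inputs are illustrative.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Tau-a: (concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(kendall_tau([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))  # 5 concordant, 1 discordant
```

This direct version compares every pair, so its cost grows quadratically with the number of observations.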

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

      customer_id  gross_revenue  recency_days  qtde_invoice  qtde_items  qtde_products  avg_basket_size  avg_unique_basket_size      freq  avg_ticket  avg_rec_days  qtde_returns
0         17850.0        5391.21         372.0          34.0      1733.0          297.0        50.970588                8.735294  0.486111   18.152222     35.500000          21.0
1         13047.0        3232.59          56.0           9.0      1390.0          171.0       154.444444               19.000000  0.048780   18.904035     27.250000           6.0
2         12583.0        6705.38           2.0          15.0      5028.0          232.0       335.200000               15.466667  0.045699   28.902500     23.187500          50.0
3         13748.0         948.25          95.0           5.0       439.0           28.0        87.800000                5.600000  0.017921   33.866071     92.666667           0.0
4         15100.0         876.00         333.0           3.0        80.0            3.0        26.666667                1.000000  0.136364  292.000000      8.600000          22.0
5         15291.0        4623.30          25.0          14.0      2102.0          102.0       150.142857                7.285714  0.054441   45.326471     23.200000          27.0
6         14688.0        5630.87           7.0          21.0      3621.0          327.0       172.428571               15.571429  0.073569   17.219786     18.300000         281.0
7         17809.0        5411.91          16.0          12.0      2057.0           61.0       171.416667                5.083333  0.039106   88.719836     35.700000          41.0
8         15311.0       60767.90           0.0          91.0     38194.0         2379.0       419.714286               26.142857  0.315508   25.543464      4.144444         231.0
9         16098.0        2005.63          87.0           7.0       613.0           67.0        87.571429                9.571429  0.024390   29.934776     47.666667           0.0

      customer_id  gross_revenue  recency_days  qtde_invoice  qtde_items  qtde_products  avg_basket_size  avg_unique_basket_size      freq  avg_ticket  avg_rec_days  qtde_returns
5627      17727.0        1060.25          15.0           1.0       645.0           66.0       645.000000                    66.0  0.285714   16.064394           6.0           6.0
5637      17232.0         421.52           2.0           2.0       203.0           36.0       101.500000                    18.0  0.153846   11.708889          12.0           0.0
5638      17468.0         137.00          10.0           2.0       116.0            5.0        58.000000                     2.5  0.400000   27.400000           4.0           0.0
5649      13596.0         697.04           5.0           2.0       406.0          166.0       203.000000                    83.0  0.250000    4.199036           7.0           0.0
5655      14893.0        1237.85           9.0           2.0       799.0           73.0       399.500000                    36.5  0.666667   16.956849           2.0           0.0
5659      12479.0         473.20          11.0           1.0       382.0           30.0       382.000000                    30.0  0.333333   15.773333           4.0          34.0
5680      14126.0         706.13           7.0           3.0       508.0           15.0       169.333333                     5.0  1.000000   47.075333           3.0          50.0
5686      13521.0        1092.39           1.0           3.0       733.0          435.0       244.333333                   145.0  0.300000    2.511241           4.5           0.0
5696      15060.0         301.84           8.0           4.0       262.0          120.0        65.500000                    30.0  2.000000    2.515333           1.0           0.0
5715      12558.0         269.96           7.0           1.0       196.0           11.0       196.000000                    11.0  0.285714   24.541818           6.0         102.0